Interchassis Session Recovery

This appendix describes how to configure Interchassis Session Recovery (ICSR). The product Administration Guides provide examples and procedures for configuration of basic services on the system. You should select the configuration example that best meets your service model, and configure the required elements for that model as described in the respective product Administration Guide, before using the procedures in this appendix.
Important: ICSR is a licensed Cisco feature that requires a separate license. Contact your Cisco account representative for detailed information on specific licensing requirements. For information on installing and verifying licenses, refer to the Managing License Keys section of the Software Management Operations chapter.
This appendix discusses the following:
Caution: ICSR should not be configured on chassis supporting L2TP calls.
Overview
The ICSR feature provides the highest possible availability for continuous call processing without interrupting subscriber services. ICSR allows the operator to configure geographically distant gateways for redundancy purposes. In the event of a node or gateway failure, ICSR allows sessions to be transparently routed around the failure, thus maintaining the user experience. ICSR also preserves session information and state.
ICSR is implemented through the use of redundant chassis. The chassis are configured as primary and backup, with one being active and one standby. Both chassis are connected to the same AAA server. A checkpoint duration timer controls when subscriber data is sent from the active chassis to the standby chassis. If the active chassis handling the call traffic goes out of service, the standby chassis transitions to the active state and continues processing the call traffic without interrupting the subscriber session.
The chassis determine which is active through a proprietary TCP-based connection known as the Service Redundancy Protocol (SRP) link. The SRP link is used to exchange Hello messages between the primary and backup chassis and must be maintained for proper system operation.
Important: Refer to the Product Overview to verify whether a specific service supports ICSR as an option.
Interchassis Communication
Chassis configured to support ICSR communicate using periodic Hello messages. Each chassis sends these messages to notify its peer of its current state. The Hello message contains information about the chassis, such as its configuration and priority. A dead interval sets the time limit within which a Hello message must be received from the chassis’ peer. If the standby chassis does not receive a Hello message from the active chassis within the dead interval, the standby chassis transitions to the active state. In situations where the SRP link goes out of service, a priority scheme is used to determine which chassis processes the session. The following priority scheme is used:
Checkpoint Messages
Checkpoint messages are sent from the active chassis to the standby chassis. These messages are sent at specific intervals and contain all the information needed to recreate the sessions on the standby chassis, if that chassis were to become active. Once a session exceeds the checkpoint duration, checkpoint data is collected on the session.
AAA Monitor
AAA servers are monitored using the authentication probe mechanism. AAA servers are considered Up if the authentication-probe receives a valid response. AAA servers are considered Down when the max-retries count specified in the configuration of the AAA server has been reached. SRP will initiate a switchover when none of the configured AAA servers responds to an authentication probe. AAA probing is only performed on the active chassis.
Important: A switchover event caused by an AAA monitoring failure is non-revertible.
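The AAA monitoring rule described above can be sketched in Python. This is only an illustration of the decision logic, not the SRP implementation: the AaaServer class, its field names, and should_switch_over are hypothetical names introduced here, and the sketch assumes that a valid probe response resets the retry count.

```python
from dataclasses import dataclass


@dataclass
class AaaServer:
    """Hypothetical record of probe results for one configured AAA server."""
    name: str
    max_retries: int          # max-retries value from the AAA server configuration
    failed_probes: int = 0    # consecutive probes that received no valid response

    def record_probe(self, got_valid_response: bool) -> None:
        # Assumption: a valid response marks the server Up and resets the count.
        self.failed_probes = 0 if got_valid_response else self.failed_probes + 1

    @property
    def is_down(self) -> bool:
        # Down once the configured max-retries count has been reached.
        return self.failed_probes >= self.max_retries


def should_switch_over(servers: list[AaaServer], chassis_is_active: bool) -> bool:
    """Return True when the active chassis sees every configured AAA server Down."""
    if not chassis_is_active or not servers:
        # Probing runs only on the active chassis; no servers means nothing to monitor.
        return False
    return all(server.is_down for server in servers)
```

A single responsive server is enough to prevent a switchover in this sketch, which matches the statement that SRP acts only when none of the configured servers responds.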
If the newly active chassis fails to monitor the configured AAA servers, it remains as the active chassis until one of the following occurs:
BGP Interaction
The Service Redundancy Protocol implements non-revertible switchover behavior via a mechanism that adjusts the route modifier value for the advertised loopback/IP Pool routes. The initial route modifier value is determined by the chassis’ configured role and is set higher than a normal operational value. This ensures that in the event of an SRP link failure and an SRP task failure, the correct chassis is still preferred in the routing domain.
The Active and Standby chassis share current route modifier values. When BGP advertises the loopback and IP pool routes, it converts the route modifier into an autonomous system (AS) path prepend count. The Active chassis always has a lower route modifier, and thus prepends less to the AS-path attribute. This causes the route to be preferred in the routing domain.
If communication on the SRP link is lost, and both chassis in the redundant pair are claiming to be Active, the previously Active chassis is still preferred since it is advertising a smaller AS-path into the BGP routing domain. The route modifier is incremented as switchover events occur. A threshold determines when the route modifier should be reset to its initial value to avoid rollover.
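The effect of the route modifier on route preference can be illustrated with a small Python sketch. It assumes that the prepend count equals the route modifier (the text only states that one is converted into the other) and that all other BGP attributes are equal, so the shorter AS path decides. The function names and the AS number 65001 are hypothetical.

```python
def advertised_as_path(local_as: int, route_modifier: int) -> list[int]:
    """Build the AS path a chassis advertises for its loopback and IP pool routes.

    Assumption: the prepend count equals the route modifier. The real mapping
    is not described in this appendix.
    """
    if route_modifier < 0:
        raise ValueError("route modifier must be non-negative")
    # One copy of the local AS is always present; the rest are prepends.
    return [local_as] * (1 + route_modifier)


def preferred_chassis(paths: dict[str, list[int]]) -> str:
    """Pick the chassis whose route has the shortest AS path."""
    return min(paths, key=lambda name: len(paths[name]))


# Both chassis claim to be Active after the SRP link is lost.
paths = {
    "previously-active": advertised_as_path(local_as=65001, route_modifier=3),
    "previously-standby": advertised_as_path(local_as=65001, route_modifier=4),
}
print(preferred_chassis(paths))  # prints "previously-active"
```

Because the previously Active chassis holds the lower route modifier, its shorter AS path keeps traffic flowing to it even though both chassis advertise the same routes.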
Requirements
ICSR configurations require the following:
Redundancy – to configure the primary and backup chassis redundancy.
Source – the nas-ip-address specified in the AAA configuration must be the IP address of an interface bound to an HA, or to any core network service configured within the same context.
Destination – to configure monitoring and routing to the PDN.
Important: ICSR is a licensed Cisco feature. Verify that each chassis has the appropriate license before using the procedures in this appendix. To do this, log in to both chassis and execute a show license information command. Look for “Inter-Chassis Session Recovery”. If the chassis is not licensed, please contact your Cisco account representative.
 
Caution: ICSR should not be configured for chassis supporting L2TP calls.
The following figure shows an ICSR network.
ASR 5000 ICSR Network
ICSR Operation
This section shows operational flows for ICSR.
The following figure shows an ICSR process flow due to a primary failure.
ICSR Process Flow (Primary Failure)
The following figure shows an ICSR process flow due to a manual switchover.
ICSR Process Flow (Manual Switchover)
Chassis Initialization
When the chassis are simultaneously initialized, they send Hello messages to their configured peer. The peer sends a response, communication between the chassis is established, and messages containing configuration information are exchanged.
During initialization, if both chassis are misconfigured in the same mode, both active (primary) or both standby (backup), the chassis with the highest priority (lowest number set with the ICSR priority command) becomes active and the other chassis becomes the standby.
If the chassis priorities are the same, the system compares the two MAC addresses and the chassis with the higher SPIO MAC address becomes active. For example, if the chassis have MAC addresses of 00-02-43-03-1C-2B and 00-02-43-03-01-3B, the last three octets (the first three are the vendor code) are compared. In this example, 03-1C-2B and 03-01-3B are compared from left to right. The first octet (03) is the same in both MAC addresses, so the next octets are compared. Since 01 is lower than 1C, the chassis with the SPIO MAC address of 00-02-43-03-1C-2B becomes active and the other chassis becomes the standby.
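The election rule for two chassis that start in the same mode can be sketched in Python as follows. The Chassis class and function names are hypothetical. The sketch follows the priority command notes in this appendix, where the higher-priority chassis has the lower number, and it treats the last three MAC octets as the tiebreaker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chassis:
    name: str
    priority: int   # value set with the priority command; the lower number wins
    spio_mac: str   # for example "00-02-43-03-1C-2B"


def mac_tiebreaker(mac: str) -> tuple[int, ...]:
    """Return the last three octets as integers, skipping the vendor code."""
    octets = [int(part, 16) for part in mac.split("-")]
    if len(octets) != 6:
        raise ValueError(f"expected six octets, got {mac!r}")
    return tuple(octets[3:])


def elect_active(a: Chassis, b: Chassis) -> Chassis:
    """Choose the active chassis when both start in the same mode."""
    if a.priority != b.priority:
        return a if a.priority < b.priority else b
    # Equal priority: tuples compare left to right, so the higher MAC wins.
    return a if mac_tiebreaker(a.spio_mac) > mac_tiebreaker(b.spio_mac) else b


x = Chassis("chassis-1", priority=10, spio_mac="00-02-43-03-1C-2B")
y = Chassis("chassis-2", priority=10, spio_mac="00-02-43-03-01-3B")
print(elect_active(x, y).name)  # prints "chassis-1"
```

With equal priorities, the comparison reaches the second tiebreaker octet, where 1C is higher than 01, so the first chassis becomes active, as in the worked example above.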
Chassis Operation
This section describes how the chassis communicate, maintain subscriber sessions, and perform chassis switchover.
Chassis Communication
If one chassis is in the active state and one is in the standby state, they both send Hello messages at each hello interval. Subscriber sessions that exceed the checkpoint session duration are included in checkpoint messages that are sent to the standby chassis. The checkpoint message contains subscriber session information so that if the active chassis goes out of service, the backup chassis becomes active and is able to continue processing the subscriber sessions. Additional checkpoint messages occur at various intervals whenever subscriber session information is updated on the standby chassis.
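As a rough illustration of the checkpoint rule, the following Python sketch selects the sessions that qualify for checkpointing. The function name, the session identifiers, and the use of seconds as the unit are assumptions made for this example only.

```python
def sessions_to_checkpoint(session_ages: dict[str, float],
                           checkpoint_duration: float) -> list[str]:
    """Return the IDs of sessions that have exceeded the checkpoint duration.

    session_ages maps a session ID to the time the session has existed, in the
    same unit as checkpoint_duration (assumed here to be seconds).
    """
    # Only sessions strictly older than the checkpoint duration qualify.
    return sorted(sid for sid, age in session_ages.items()
                  if age > checkpoint_duration)


ages = {"session-a": 70.0, "session-b": 10.0, "session-c": 60.0}
print(sessions_to_checkpoint(ages, checkpoint_duration=60.0))  # prints ['session-a']
```

Short-lived sessions never reach the standby chassis in this model, which is why only sessions with checkpoint information survive a switchover.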
Chassis Switchover
If the active chassis goes out of service, the standby chassis continues to send Hello messages. If the standby chassis does not receive a response to the Hello messages within the dead interval, the standby chassis initiates a switchover. During the switchover, the standby chassis begins advertising its srp-activated loopback and pool routes into the routing domain. Once the chassis becomes active, it continues to process existing AAA services and subscriber sessions that had checkpoint information, and is also able to establish new subscriber sessions.
When the primary chassis is back in service, it sends Hello messages to the configured peer. The peer sends a response, establishes communication between the chassis, and sends Hello messages that contain configuration information. The primary chassis receives a Hello message that shows the backup chassis state as active and then transitions to standby. The Hello messages continue to be sent to each peer, and checkpoint information is now sent from the active chassis to the standby chassis at regular intervals.
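The two transitions described in this section, a standby chassis taking over after the dead interval expires and a returning chassis yielding to an active peer, can be sketched as a small state function in Python. The State enum and next_state function are hypothetical, and the sketch deliberately leaves out the case where both chassis claim to be active, which is resolved by the priority scheme and BGP route modifier instead.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    INIT = "init"
    ACTIVE = "active"
    STANDBY = "standby"


def next_state(current: State, peer_state: Optional[State],
               seconds_since_peer_hello: float, dead_interval: float) -> State:
    """Return the state a chassis should hold after checking its Hello data.

    peer_state is the state carried in the most recent Hello message from the
    peer, or None if no Hello message has been received.
    """
    peer_silent = peer_state is None or seconds_since_peer_hello > dead_interval
    if current is State.STANDBY and peer_silent:
        # No Hello message within the dead interval: the standby takes over.
        return State.ACTIVE
    if current is State.INIT and not peer_silent and peer_state is State.ACTIVE:
        # A chassis returning to service yields to a peer that is already active.
        return State.STANDBY
    return current
```

For example, next_state(State.STANDBY, State.ACTIVE, 31, 30) returns State.ACTIVE because the last Hello message is older than the 30-second dead interval, while the same call with an elapsed time of 5 seconds leaves the chassis in standby.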
When chassis switchover occurs, the session timers are recovered. The access gateway session recovery is recreated with the full lifetime to avoid potential loss of the session and the possibility that a renewal update was lost in the transitional checkpoint update process.
Configuring Interchassis Session Recovery (ICSR)
Important: The ICSR configuration must be the same on the primary and backup chassis. If each chassis has a different Service Redundancy Protocol (SRP) configuration, the session recovery feature does not function and sessions cannot be recovered when the active chassis goes out of service.
This section describes how to configure basic ICSR on each chassis. For information on commands that configure additional parameters and options, refer to the Command Line Interface Reference.
Caution: ICSR should not be configured for chassis supporting L2TP calls.
The procedures described below assume the following:
For more configuration information and instructions on configuring services, refer to the respective product Administration Guide.
For more information on configuring the AAA server, refer to the AAA Interface Administration and Reference.
BGP router installed and configured. See the Routing appendix in this guide for more information on configuring BGP services.
To configure ICSR on a primary and/or backup chassis:
Step 1
Step 2
Step 3
Step 4
Optional: Disable bulk statistics collection on the standby system by applying the example configuration in the Disabling Bulk Statistics Collection on a Standby System section.
Step 5
Step 6
Save your configuration as described in the Verifying and Saving Your Configuration chapter of this guide.
Configuring the Service Redundancy Protocol (SRP) Context
To configure the system to work with ICSR:
Step 1
Step 2
Step 3
Step 4
Step 5
Save your configuration as described in the Verifying and Saving Your Configuration chapter of this guide.
Creating and Binding the SRP Context
Use the example below to create the SRP context and bind it to the primary chassis IP address:
Important: ICSR is configured using two systems. Be sure to create the redundancy context on both systems. CLI commands must be executed on both systems. Log onto both chassis before continuing. Always make configuration changes on the primary chassis first. Before starting this configuration, identify which chassis to configure as the primary and use that login session.
configure
  context <srp_ctxt_name> [ -noconfirm ]
     service-redundancy-protocol
        bind address <ip_address>
        end
Notes:
Configuring the SRP Context Parameters
This configuration assigns a chassis mode and priority, and also configures the redundancy link between the primary and backup chassis:
Important: CLI commands must be executed on both chassis. Log onto both chassis before continuing. Always make configuration changes on the primary chassis first.
configure
  context <srp_ctxt_name>
     service-redundancy-protocol
        chassis-mode { primary | backup }
        priority <priority>
        peer-ip-address <ip_address>
        hello-interval <dur_sec>
        dead-interval <dead_dur_sec>
        end
Notes:
The priority determines which chassis becomes active when the redundancy link goes out of service. The higher priority chassis has the lower number. Be sure to assign different priorities to each chassis.
Enter the IP address of the backup chassis as the peer-ip-address on the primary chassis. Assign the IP address of the primary chassis as the peer-ip-address on the backup chassis.
The dead-interval must be at least three times greater than the hello-interval. For example, if the hello interval is 10, the dead interval should be at least 30. System performance is severely impacted if the hello interval and dead interval are not set properly.
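The notes above lend themselves to a quick sanity check before the values are entered on each chassis. The Python sketch below is one way to express them. The SrpSettings class and check_pair function are hypothetical, and the sketch assumes that the address each chassis lists as its peer is the address the other chassis binds its SRP service to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrpSettings:
    bind_address: str       # address bound in the SRP context of this chassis
    peer_ip_address: str    # value planned for peer-ip-address
    priority: int           # value planned for priority
    hello_interval: int     # seconds
    dead_interval: int      # seconds


def check_pair(primary: SrpSettings, backup: SrpSettings) -> list[str]:
    """Return a list of problems found in the planned SRP settings."""
    problems = []
    if primary.priority == backup.priority:
        problems.append("priorities must differ between the chassis")
    if primary.peer_ip_address != backup.bind_address:
        problems.append("primary peer-ip-address does not match the backup address")
    if backup.peer_ip_address != primary.bind_address:
        problems.append("backup peer-ip-address does not match the primary address")
    for label, settings in (("primary", primary), ("backup", backup)):
        # The dead interval should be at least three times the hello interval.
        if settings.dead_interval < 3 * settings.hello_interval:
            problems.append(f"{label} dead-interval is less than 3 x hello-interval")
    return problems
```

An empty list means the planned values satisfy all three notes. A hello interval of 10 with a dead interval of 20, for instance, would be reported as a problem.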
Configuring the SRP Context Interface Parameters
This procedure configures the communication interface with the IP address and port number within the SRP context. This interface supports interchassis communication.
Important: CLI commands must be executed on both chassis. Log onto both chassis before continuing. Always make configuration changes on the primary chassis first.
configure
  context <vpn_ctxt_name> [ -noconfirm ]
     interface <srp_if_name>
        ip-address { <ip_address> | <ip_address>/<mask> }
        exit
     exit
  port ethernet <slot_num>/<port_num>
     description <des_string>
     medium { auto | speed { 10 | 100 | 1000 } duplex { full | half } }
     no shutdown
     bind interface <srp_if_name> <srp_ctxt_name>
     end
Verifying SRP Configuration
Step 1
Sample output for this command is shown below. In this example, an SRP context called srp1 was configured with default parameters.
Service Redundancy Protocol:
----------------------------------------------------------------------
Context: srp1
Local Address: 0.0.0.0
Chassis State: Init
Chassis Mode: Backup
Chassis Priority: 125
Local Tiebreaker: 00-00-00-00-00-00
Route-Modifier: 34
Peer Remote Address: 0.0.0.0
Peer State: Init
Peer Mode: Init
Peer Priority: 0
Peer Tiebreaker: 00-00-00-00-00-00
Peer Route-Modifier: 0
Last Hello Message received: -
Peer Configuration Validation: Initial
Last Peer Configuration Error: None
Last Peer Configuration Event: -
Connection State: None
Modifying the Source Context for ICSR
To modify the source context of the core service:
Step 1
Step 2
Step 3
Step 4
Save your configuration as described in the Verifying and Saving Your Configuration chapter in this guide.
Configuring BGP Router and Gateway Address
Use the following example to create the BGP context and network addresses.
configure
  context <source_ctxt_name>
     router bgp <AS_num>
        network <gw_ip_address>
        neighbor <neighbor_ip_address> remote-as <AS_num>
        end
Notes:
source_ctxt_name is the context where the core network service is configured.
Configuring SRP Context for BGP
Use the following example to configure the BGP context and IP addresses in the SRP context.
configure
  context <srp_ctxt_name>
     service-redundancy-protocol
        monitor bgp context <source_ctxt_name> <neighbor_ip_address>
        end
Verifying BGP Configuration
Step 1
Verify your BGP configuration by entering the show srp monitor bgp command (Exec Mode).
Modifying the Destination Context for ICSR
To modify the destination context of the core service:
Step 1
Step 2
Step 3
Set the subscriber mode to default by following the steps in the Setting Subscriber to Default Mode section.
Step 4
Step 5
Save your configuration as described in the Verifying and Saving Your Configuration chapter in this guide.
Configuring BGP Router and Gateway Address in Destination Context
Use the following example to create the BGP context and network addresses.
configure
  context <dest_ctxt_name>
     router bgp <AS_num>
        network <gw_ip_address>
        neighbor <neighbor_ip_address> remote-as <AS_num>
        end
Notes:
AS_num is the autonomous system (AS) number for this BGP router.
Configuring SRP Context for BGP for Destination Context
Use the following example to configure the BGP context and IP addresses in the SRP context.
configure
  context <srp_ctxt_name>
     service-redundancy-protocol
        monitor bgp context <dest_ctxt_name> <neighbor_ip_address>
        end
Setting Subscriber to Default Mode
Use the following example to set the subscriber mode to default.
configure
  context <dest_ctxt_name>
     subscriber default
     end
Verifying BGP Configuration in Destination Context
Step 1
Verify your BGP configuration by entering the show srp monitor bgp command (Exec Mode).
Disabling Bulk Statistics Collection on a Standby System
You can disable the collection of bulk statistics from a system when it is in the standby mode of operation.
Important: When this feature is enabled and a system transitions to standby state, any pending accumulated statistical data is transferred at the first opportunity. After that no additional statistics gathering takes place until the system comes out of standby state.
Use the following example to disable the bulk statistics collection on a standby system.
configure
  bulkstat mode
     no gather-on-standby
     end
Repeat this procedure for both systems.
Verifying the Primary and Backup Chassis Configuration
This section describes how to compare the ICSR configuration on both chassis.
Step 1
Enter the show configuration srp command on both chassis (Exec mode).
Verify that both chassis have the same SRP configuration information. The output looks similar to the following:
config
  context source
     interface haservice loopback
        ip address 172.17.1.1 255.255.255.255 srp-activate
     #exit
     radius attribute nas-ip-address address 172.17.1.1
     radius server 192.168.83.2 encrypted key 01abd002c82b4a2c port 1812
     radius accounting server 192.168.83.2 encrypted key 01abd002c82b4a2c port 1813
     ha-service ha-pdsn
        mn-ha-spi spi-number 256 encrypted secret 6c93f7960b726b6f6c93f7960b726b6f hash-algorithm md5
        fa-ha-spi remote-address 192.168.82.0/24 spi-number 256 encrypted secret 1088bdd6817f64df
        bind address 172.17.1.1
     #exit
  #exit
  context destination
     ip pool dynamic 172.18.0.0 255.255.0.0 public 0 srp-activate
     ip pool static 172.19.0.0 255.255.240.0 static srp-activate
  #exit
  context srp
     service-redundancy-protocol
     #exit
  #exit
end
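Comparing the two outputs by eye is error-prone on longer configurations. A minimal Python sketch of an automated comparison is shown below. It assumes the output of show configuration srp has already been saved as text from each chassis (how it is captured is outside this sketch), and the function name is hypothetical.

```python
import difflib


def srp_config_diff(primary_text: str, backup_text: str) -> list[str]:
    """Return unified-diff lines between two saved configuration outputs.

    Leading and trailing whitespace and blank lines are ignored so that
    indentation differences alone do not show up as mismatches.
    """
    primary_lines = [line.strip() for line in primary_text.splitlines() if line.strip()]
    backup_lines = [line.strip() for line in backup_text.splitlines() if line.strip()]
    return list(difflib.unified_diff(primary_lines, backup_lines,
                                     fromfile="primary", tofile="backup",
                                     lineterm=""))
```

An empty list means the two outputs match line for line. Any lines that are expected to differ between the chassis, such as chassis-mode or peer-ip-address when they appear in the output, will also be reported and should be reviewed rather than treated as errors.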